Normalization and Paraphrasing Using Symbolic Methods

Authors

  • Caroline Brun
  • Caroline Hagège
Abstract

We describe ongoing work in information extraction, which we treat as a text normalization task. The normalized representation can be used to detect paraphrases in texts. Normalization and paraphrase detection are built on top of a robust analyzer for English and rely exclusively on symbolic methods. Both grammar development rules and information extraction rules are expressed in the same formalism and developed in an integrated way. The experiment described in the paper has been evaluated and shows encouraging results.
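
As a rough illustration of the idea that a normalized representation can support paraphrase detection, the toy sketch below maps two surface patterns (a simple active clause and a simple passive clause) to the same predicate-argument tuple and compares the results. It is not the authors' system, which relies on a robust analyzer for English; the `Fact` type, the `CANONICAL` lexicon, and the two hard-coded patterns are all hypothetical and chosen only to keep the example self-contained.

```python
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    """Hypothetical normalized form: one predicate with its two arguments."""
    predicate: str
    agent: str
    patient: str


# Hypothetical toy lexicon: several surface verbs map to one canonical predicate.
CANONICAL = {"acquired": "ACQUIRE", "bought": "ACQUIRE", "purchased": "ACQUIRE"}


def normalize(sentence: str) -> Optional[Fact]:
    """Map a very restricted sentence shape to a Fact, or None if unsupported."""
    tokens = sentence.rstrip(".").split()
    if len(tokens) == 5 and tokens[1] == "was" and tokens[3] == "by":
        # Passive pattern: PATIENT was VERB by AGENT
        patient, verb, agent = tokens[0], tokens[2], tokens[4]
    elif len(tokens) == 3:
        # Active pattern: AGENT VERB PATIENT
        agent, verb, patient = tokens
    else:
        return None
    predicate = CANONICAL.get(verb.lower())
    if predicate is None:
        return None
    return Fact(predicate, agent, patient)


def are_paraphrases(a: str, b: str) -> bool:
    """Two sentences count as paraphrases if they normalize to the same Fact."""
    fact_a, fact_b = normalize(a), normalize(b)
    return fact_a is not None and fact_a == fact_b
```

With these assumptions, `are_paraphrases("Acme bought Widgets.", "Widgets was acquired by Acme.")` holds because both sentences reduce to the same tuple, while swapping the agent and patient yields a different tuple and therefore no match.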

Related resources

Paraphrasing 4 Microblog Normalization

Compared to the edited genres that have played a central role in NLP research, microblog texts use a more informal register with nonstandard lexical items, abbreviations, and free orthographic variation. When confronted with such input, conventional text analysis tools often perform poorly. Normalization — replacing orthographically or lexically idiosyncratic forms with more standard variants —...

Comparison of Count Normalization Methods for Statistical Parametric Mapping Analysis Using a Digital Brain Phantom Obtained from Fluorodeoxyglucose-positron Emission Tomography

Objective(s): Alternative normalization methods were proposed to solve the biased information of SPM in the study of neurodegenerative disease. The objective of this study was to determine the most suitable count normalization method for SPM analysis of a neurodegenerative disease based on the results of different count normalization methods applied on a prepared digital phantom similar to one ...

Statistical Machine Translation on Paraphrased Corpora

This paper presents a statistical machine translation trained on normalized corpora. The automatic paraphrasing is carried out by inducing paraphrasing expressions from a bilingual corpus. Then, the normalization is treated as a specific paraphrase of a given input determined by the frequency in a corpus. The experimental results on Japanese-to-English translation with normalized English corpus...

Gathering and Generating Paraphrases from Twitter with Application to Normalization

We present a new and unique paraphrase resource, which contains meaning-preserving transformations between informal user-generated text. Sentential paraphrases are extracted from a comparable corpus of temporally and topically related messages on Twitter which often express semantically identical information through distinct surface forms. We demonstrate the utility of this new resource on the t...

Dependency-based paraphrasing for recognizing textual entailment

This paper addresses syntax-based paraphrasing methods for Recognizing Textual Entailment (RTE). In particular, we describe a dependency-based paraphrasing algorithm, using the DIRT data set, and its application in the context of a straightforward RTE system based on aligning dependency trees. We find a small positive effect of dependency-based paraphrasing on both the RTE3 development and test...

Publication date: 2003